Foundation Guardrails
Overview
Dynamo AI provides Foundation Guardrails, a set of pre-defined, validated, and trained guardrails covering common use cases. These Foundation Guardrails are based on expert-defined policies and Dynamo AI's ML best practices for training and evaluation.
Each Foundation Guardrail includes supporting metadata that allows an organization to review, refine, and then deploy the guardrail tailored to its specific operational use case. This metadata consists of: a policy definition, comprising a short paragraph-long description and a set of behaviors that delineate allowed and disallowed content; the policy research underlying the definition; a training dataset; a human-annotated benchmark dataset; and a trained model. Please contact our team to access these details.
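The metadata bundle described above can be pictured as a simple record. The sketch below is purely illustrative: the class name and field names are assumptions for exposition, not Dynamo AI's actual schema.

```python
from dataclasses import dataclass

# Hypothetical sketch of a Foundation Guardrail's metadata bundle.
# All names here are illustrative, not Dynamo AI's actual schema.
@dataclass
class FoundationGuardrail:
    name: str                        # short guardrail name
    definition: str                  # paragraph-long policy description
    allowed_behaviors: list[str]     # content the policy permits
    disallowed_behaviors: list[str]  # content the policy blocks
    references: list[str]            # policy research underlying the definition
    training_dataset: str            # path or ID of the training dataset
    benchmark_dataset: str           # path or ID of the human-annotated benchmark
    model_id: str                    # identifier of the trained guardrail model
```

An organization reviewing a guardrail would inspect and, where necessary, revise each of these fields before deployment.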
Recommended Use
Foundation Guardrails can be used immediately after their performance has been validated for a specific domain and use case. They can also serve as starting points: enterprises can customize Foundation Guardrails to better suit their requirements and application domain. Please see the Validation section to learn more.
Foundation Guardrails Inventory
Policy | Input or Output | Summary | Category Tags |
---|---|---|---|
Prohibit Financial Advice | Input | Detects user inputs requesting financial advice. | Advice |
Prohibit Compensation Data | Input | Detects user inputs requesting sensitive compensation data. | Advice |
Prohibit Manipulation or Deceptive Language | Output | Detects manipulative and deceptive language in responses. Based on Article 5: Prohibited AI Practices, from the EU AI Act. | EU AI Act |
Prohibit Emotion Recognition | Output | Detects responses that identify or infer user emotions. Based on Article 5: Prohibited AI Practices, from the EU AI Act. | EU AI Act |
Prohibit Biometric Inference and Categorization | Input | Detects user inputs that request biometric inference or categorization of individuals or request information on how to build biometric categorization systems. Based on Article 5: Prohibited AI Practices, from the EU AI Act. | EU AI Act |
Prohibit Biometric Inference and Categorization | Output | Detects responses that apply biometrics to categorize or infer personal characteristics about individuals, or that provide information on how to do this. Based on Article 5: Prohibited AI Practices, from the EU AI Act. | EU AI Act |
Prohibit Criminal Offense Risk Assessment | Input | Detects user inputs that request criminal profiling or risk assessment of individuals. Based on Article 5: Prohibited AI Practices, from the EU AI Act. | EU AI Act |
Prohibit Criminal Offense Risk Assessment | Output | Detects responses that criminally profile or discuss the risk of individuals committing a criminal offense. Based on Article 5: Prohibited AI Practices, from the EU AI Act. | EU AI Act |
Prohibit Social Scoring | Input | Detects user inputs that request social scoring of individuals. Based on Article 5: Prohibited AI Practices, from the EU AI Act. | EU AI Act |
Prohibit Social Scoring | Output | Detects responses that socially score individuals. Based on Article 5: Prohibited AI Practices, from the EU AI Act. | EU AI Act |
Validation
Implementation Notice
There are numerous variables that may impact the safety and compliance of an AI system, including, but not limited to: the Large Language Model (LLM) used to support a use case, the definition and supporting metadata of each policy guardrail, the breadth and coverage of the policy guardrails selected, the guardrail training process, and additional people, process, and technology AI governance controls.
Foundation Guardrails provide a starting point for mitigating AI risk. Each guardrail should be independently tested and validated by enterprises utilizing DynamoGuard. Validation may involve the appropriate business, technology, risk, and control partners, and should include approval by the appropriate stakeholders or governance body prior to implementation. Foundation Guardrails should not be deployed without organizational validation.
Components to Review
Each Foundation Guardrail contains three components for stakeholders to review and then refine as necessary:
ID | Component | Detail | Form |
---|---|---|---|
1 | Definition Details | • Foundation Guardrail Name (a short name of the guardrail) • Foundation Guardrail Definition (a few sentences on the requirements of the guardrail) • Allowed and Disallowed Behaviors (types of allowed and disallowed prompts or responses) • Use Case (description of the domain and use case for the policy) | Document |
2 | References | • Regulatory or marketplace research references • Foundation Guardrail notes or additional guidance | Document |
3 | Benchmark Data | • Benchmark dataset of at least 100 data points of compliant and noncompliant data specific to this policy, per Dynamo AI validation • For Input policies, a prompt is provided. For Output policies, a prompt and a response are provided. | Excel |
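As a rough illustration of the benchmark's shape, the check below enforces the constraints described above: at least 100 labeled data points, a prompt for Input policies, and a prompt plus a response for Output policies. The column names and label values are assumptions for exposition, not the actual Dynamo AI schema.

```python
# Illustrative sketch only: the column names ("prompt", "response", "label")
# and label values are assumptions, not the actual Dynamo AI benchmark schema.

def validate_benchmark(rows: list[dict], policy_type: str) -> list[str]:
    """Return a list of problems found in a benchmark dataset.

    policy_type is "input" or "output"; an empty list means the
    benchmark passes these basic structural checks.
    """
    problems = []
    if len(rows) < 100:
        problems.append(f"benchmark has {len(rows)} rows; at least 100 are expected")
    for i, row in enumerate(rows):
        if row.get("label") not in {"compliant", "noncompliant"}:
            problems.append(f"row {i}: label must be 'compliant' or 'noncompliant'")
        if not row.get("prompt"):
            problems.append(f"row {i}: missing prompt")
        if policy_type == "output" and not row.get("response"):
            problems.append(f"row {i}: output policies require a response")
    return problems
```

A check like this can be run after any manual relabeling (Step 2B below) to catch structural mistakes before the benchmark is used for testing.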
Recommended Customer Validation Steps
To use Foundation Guardrails effectively, users should execute three key steps. First, review, validate, or update the policy definition. Second, review, validate, or update the benchmark dataset. Finally, test the guardrail for the specific use case. After these steps, users should deploy, monitor, and continuously improve the policy guardrail as they would any other guardrail. Dynamo AI's documentation on guardrail evaluations and monitoring covers these best practices in further depth.
Below are further details about each step.
Step | Process | Recommendation | Output |
---|---|---|---|
1 | Review | 1) Review the Foundation Guardrail Policy definition and benchmark set with identified cross-functional stakeholders. This includes stakeholders familiar with the regulatory and policy requirements of the use case. 2) Assess the need to update the policy definition to align to the relevant use case, business, jurisdiction, size, complexity, or risk tolerance. If no change is needed, move to Step 3. | 1. A set of accountable stakeholders responsible for review and validation 2. A validated policy definition or plan to update it. |
2A | Refine Policy Definition (Optional) | 1) In the DynamoGuard platform, edit the policy definition or supporting details as necessary. 2) Given an edited policy definition, new training data will be generated for review. After review of this training data, a new custom policy model will be trained. | 1. A revised Foundation Guardrail policy definition 2. Documentation of revised policy approvals by stakeholders (if applicable) |
2B | Refine Benchmark (Optional) | 1) Outside of the DynamoGuard platform, review the benchmark data and update the labels if necessary. | 1. A revised benchmark dataset |
3 | Test | 1) Design and execute a testing plan to validate the Foundation Guardrail for the specific use case it will be deployed in. This may include identifying pertinent stakeholders to test the guardrail, obtaining adequate training data, and creating a use case specific benchmark dataset that will validate the Foundation Guardrail. 2) Based on results, potentially return to Step 2 to revise the Foundation Guardrail Policy definition and metadata. 3) Document the results, share for approvals, and receive approval. | 1. Documented testing results 2. Documented approvals by stakeholders for deployment of the Foundation Guardrail. |
4 | Deploy | 1) Deploy the finalized Foundation Guardrail into production. | 1. A deployed Foundation Guardrail |
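The testing step (Step 3) typically reduces to scoring the guardrail's decisions against the human-annotated benchmark. The sketch below computes standard detection metrics, treating "noncompliant" as the positive class; the label names and the boolean decision format are illustrative assumptions, not a Dynamo AI API.

```python
# Illustrative sketch of benchmark scoring for Step 3 (Test).
# Label names and the flagged/not-flagged decision format are assumptions.

def guardrail_metrics(labels: list[str], flagged: list[bool]) -> dict:
    """Precision, recall, and F1, treating 'noncompliant' as the positive class.

    labels:  human-annotated benchmark labels ("compliant" / "noncompliant")
    flagged: whether the guardrail flagged each corresponding data point
    """
    tp = sum(1 for lab, f in zip(labels, flagged) if lab == "noncompliant" and f)
    fp = sum(1 for lab, f in zip(labels, flagged) if lab == "compliant" and f)
    fn = sum(1 for lab, f in zip(labels, flagged) if lab == "noncompliant" and not f)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}
```

Results computed this way can be documented and shared for stakeholder approval; if they fall short of the organization's risk tolerance, return to Step 2 and refine the policy definition or benchmark.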